New insights into the domain of unknown function (DUF) of EccC5, the pivotal ATPase providing the secretion driving force to the ESX-5 secretion system

The crystal structure of the DUF domain of EccC5 from Mycobacterium tuberculosis, a degenerated ATPase domain with potential implications in the opening and closure of the membrane pore in the M. tuberculosis ESX-5 secretion system, is reported.

Type VII secretion (T7S) systems, also referred to as ESAT-6 secretion (ESX) systems, are molecular machines that have gained great attention due to their implications in cell homeostasis and in host-pathogen interactions in mycobacteria. The latter include important human pathogens such as Mycobacterium tuberculosis (Mtb), the etiological cause of human tuberculosis, which constitutes a pandemic accounting for more than one million deaths every year. The ESX-5 system is exclusively found in slow-growing pathogenic mycobacteria, where it mediates the secretion of a large family of virulence factors: the PE and PPE proteins. The secretion driving force is provided by EccC5, a multidomain ATPase that operates using four globular cytosolic domains: an N-terminal domain of unknown function (EccC5 DUF) and three FtsK/SpoIIIE ATPase domains. Recent structural and functional studies of ESX-3 and ESX-5 systems have revealed EccC DUF to be an ATPase-like fold domain with potential ATPase activity, the functionality of which is essential for secretion. Here, the crystal structure of the MtbEccC5 DUF domain is reported at 2.05 Å resolution, which reveals a nucleotide-free structure with degenerated cis-acting and trans-acting elements involved in ATP binding and hydrolysis. This crystallographic study, together with a biophysical assessment of the interaction of MtbEccC5 DUF with ATP/Mg2+, supports the absence of ATPase activity proposed for this domain. It is shown that this degeneration is also present in DUF domains from other ESX and ESX-like systems, which are likely to exhibit poor or null ATPase activity. Moreover, based on an in silico model of the N-terminal region of MtbEccC5

Introduction
Molecular machines are sophisticated protein complexes that are present in all cellular organisms (Miller & Enemark, 2016). They convert the chemical energy resulting from the hydrolysis of nucleoside triphosphates (NTPs) into the mechanical force needed in numerous cellular events such as DNA replication, protein degradation, cell motility and protein secretion, amongst others (Miller & Enemark, 2016; Schmidt et al., 2012; Famelis et al., 2023; Crosskey et al., 2020). The members of the superfamily of ATPases associated with diverse cellular activities (AAA+ proteins) are key constituents of many molecular machines, where the mechanical work is performed at the expense of adenosine triphosphate (ATP) hydrolysis (Miller & Enemark, 2016; Leipe et al., 2002). Type VII secretion systems (T7SSs), also referred to as ESAT-6 secretion (ESX) systems, are AAA+-dependent molecular machines that are found in the Actinomycetota phylum and have gained great attention due to their implications in cell homeostasis and host-pathogen interactions in mycobacteria (Famelis et al., 2023; Bunduc et al., 2020; Houben et al., 2014). The latter include devastating pathogenic species such as Mycobacterium tuberculosis (Mtb), the etiological cause of human tuberculosis (TB), which constitutes a pandemic that is responsible for more than one million deaths every year (World Health Organization, 2022). T7SSs are also found in bacteria belonging to the Firmicutes phylum (T7SSb), which include other important pathogens such as Staphylococcus aureus, Listeria monocytogenes, Bacillus anthracis and B. subtilis (Zoltner et al., 2016; Mietrach et al., 2020). Of note, T7Sb systems are distantly related to the T7Sa systems found in actinomycetes, thus constituting a divergent type of secretion system.
Mycobacteria produce up to five ESX secretion systems (T7SSas), named ESX-1 to ESX-5, which consist of large membrane complexes that span the inner bacterial membrane. Among them, ESX-5 is found almost exclusively in slow-growing pathogenic mycobacteria, where it participates in nutrient uptake, intracellular colonization and modulation of the immune response during infection through the secretion of a large family of protein effectors: the PE and PPE proteins (Bunduc et al., 2020; Houben et al., 2014). These roles have important implications in the life cycle and virulence of the pathogens, thus indicating ESX-5 as a potential drug target (Bunduc et al., 2021). Efforts aiming at structural-functional characterization of mycobacterial ESX secretion systems have resulted in structures of ESX-5 from Mtb and M. xenopi (Mxp) and of ESX-3 from M. smegmatis (Msm), which have illuminated the architecture of the pore complex (Famelis et al., 2019; Poweleit et al., 2019; Bunduc et al., 2021; Beckham et al., 2021; Fig. 1a). The latter can be defined as a hexamer of protomers, in which each protomer is formed by four different membrane proteins, EccB:EccC:EccD:EccE, interacting in a 1:1:2:1 stoichiometry. EccC is a pivotal component of the pore complex, providing the secretion driving force via the hydrolysis of ATP (Famelis et al., 2019; Poweleit et al., 2019; Bunduc et al., 2021; Beckham et al., 2021). In the ESX-2 to ESX-5 systems, EccC consists of two N-terminal transmembrane (TM) helices connected to two additional helices (the stalk region) and a domain of unknown function (EccC DUF), which are followed by three FtsK/SpoIIIE AAA+ ATPase domains, here referred to as D1, D2 and D3 (EccC D1-D3; Famelis et al., 2023; Bunduc et al., 2021; Fig. 1b). This multi-domain organization is also shared by EccC orthologues (EssC) found in Firmicutes (Mietrach et al., 2020). It differs in ESX-1, where EccC is replaced by two functionally equivalent fragments: EccC1a, containing the DUF and D1 domains, and EccC1b, containing the D2 and D3 domains (Famelis et al., 2023; Houben et al., 2014). EccC/EssC enzymes are expected to operate through a mechanism involving their hexamerization. In line with this, the cytosolic region of MtbEccC5 has been observed to transit from an extended/open state to a contracted/closed state in which the enzyme multimerizes to form the cytosolic chamber that is expected to accommodate the effectors to be secreted (Bunduc et al., 2021; Fig. 1a).
The D1-D3 domains exhibit the three-layer α-β-α core structure typically found in prokaryotic AAA+ enzymes, which is frequently followed by a C-terminal α-helical lid domain that is missing in EccC/EssC enzymes (Zoltner et al., 2016). The hexameric architecture observed in the ESX-3 and ESX-5 complexes is in line with the functional oligomeric state frequently found in prokaryotic AAA+ enzymes. The latter include the FtsK and SpoIIIE ATPases, which are close relatives of EccC and are reported to function as hexamers, and in which the ATPase sites are located at the interface between two adjacent protomers (Miller & Enemark, 2016; Leipe et al., 2002; Bunduc et al., 2021; Rosenberg et al., 2015). In such cases, both subunits provide the cis- and trans-acting catalytic elements required to form the active site. The former include the Walker A motif, the Walker B motif and Sensor 1, which are typically located in the loop connecting β1 and α1, the C-terminus of β3 and the loop connecting β4 and α4, respectively. The Walker A motif, also referred to as the P-loop, consists of a glycine-rich sequence (G1xxG2xG3K[S/T] in FtsK homologues) that is involved in stabilization of the phosphate group of the nucleotide via highly conserved lysine and serine/threonine residues that enable the correct orientation of the γ-phosphate required for ATP hydrolysis (Miller & Enemark, 2016; Leipe et al., 2002). The Walker B motif provides two acidic residues (hhhhDE in FtsK homologues, where h is any hydrophobic amino acid), the second of which acts as the catalytic base that activates the water involved in nucleophilic attack on the γ-phosphate (Miller & Enemark, 2016; Leipe et al., 2002). Sensor 1 is a key catalytic element consisting of a polar residue, which is typically an asparagine but can also be a serine, threonine or aspartate. It is located between the Walker motifs to assist in the correct orientation of the catalytic glutamate towards the nucleophilic water. Trans-acting elements are positively charged amino acids, mostly (but not exclusively) arginine residues, and thus are referred to as arginine fingers or sensors (Miller & Enemark, 2016; Leipe et al., 2002). The latter include the Arg finger, which consists of an arginine (or lysine) residue located at the end of the α4 helix that orients towards the neighbouring active site to form contacts with the γ-phosphate. Through this interaction, the Arg finger favours the transition state for hydrolysis, thus playing an important role in ATP hydrolysis and, in some cases, in enzyme multimerization (Miller & Enemark, 2016; Ogura et al., 2004). The other two trans-acting motifs in AAA+ enzymes are Sensors 2 and 3, which are generally arginines that are involved in sensing and/or stabilizing the ATP/ADP-bound states and promoting conformational changes coupled to these states or nucleotide hydrolysis (Miller & Enemark, 2016; Li et al., 2015).
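The consensus motifs quoted above lend themselves to a simple sequence scan. The following is a minimal sketch, assuming that the Walker A and Walker B consensus sequences can be approximated by regular expressions; the set of residues treated as hydrophobic and both toy sequences are hypothetical choices made for illustration and are not taken from any database entry.

```python
import re

# Walker A (P-loop) consensus quoted in the text: G-x-x-G-x-G-K-[S/T]
WALKER_A = re.compile(r"G..G.GK[ST]")

# Walker B consensus quoted in the text: hhhhDE (h = hydrophobic residue).
# The residue set used for "h" is an assumption made for this sketch.
HYDROPHOBIC = "AVLIMFWC"
WALKER_B = re.compile(rf"[{HYDROPHOBIC}]{{4}}DE")


def find_motifs(sequence):
    """Return 1-based start positions of Walker A and Walker B matches."""
    seq = sequence.upper()
    return {
        "walker_a": [m.start() + 1 for m in WALKER_A.finditer(seq)],
        "walker_b": [m.start() + 1 for m in WALKER_B.finditer(seq)],
    }


# Hypothetical toy sequences (illustration only).
canonical_like = "MKTAGATGSGKSTLLNAAVILVIDEAHR"   # intact Walker A and B
degenerate_like = "MKTAGEREQVLTLLNAAVILVIDVAHR"   # both motifs degenerated

if __name__ == "__main__":
    print(find_motifs(canonical_like))
    print(find_motifs(degenerate_like))
```

A scan of this kind only flags exact matches to the consensus; it says nothing about structure, so a negative result is a prompt for closer inspection rather than proof of inactivity.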
The first structural and functional studies conducted by Rosenberg and coworkers on EccC from Thermomonospora curvata (TcrEccC) showed that the D1 domain is an active ATPase that exhibits fully conserved Walker A (GxxGxGK[S/T]) and Walker B (hhhhDE) motifs with respect to FtsKs as a close canonical orthologue, including the Arg-finger trans-acting motif (Rosenberg et al., 2015). This and subsequent studies on EccCs/EssCs from Mtb and S. aureus showed that the Walker A and B motifs are, however, degenerated in the D2 and D3 domains, which aligns with the nondetectable or poor catalytic activities observed for these domains, which have been proposed to play a regulatory role (Rosenberg et al., 2015; Wang et al., 2020; Zoltner et al., 2016). More recently, structural characterization of the MsmESX-3 complex revealed that the N-terminal domain of unknown function (DUF) exhibits an ATPase-like fold, as also observed in the subsequently reported DUF domains of the MtbESX-5 and MxpESX-5 complexes (Famelis et al., 2019; Bunduc et al., 2021; Beckham et al., 2021). The authors described the presence of a Walker B-like motif in MsmEccC3 DUF (hhhhDD320, compared with the canonical hhhhDE) and how the mutation of each aspartate to alanine obliterates secretion. These observations highlight the essentiality of MsmEccC3 DUF in secretion and point to a potential ATPase activity of this domain (Famelis et al., 2019). The structure of MsmEccC3 DUF is, however, devoid of ATP, analogously to the MtbEccC5 DUF and MxpEccC5 DUF structures, which exhibit valine and isoleucine residues, respectively, that replace the catalytic amino acid (Famelis et al., 2019; Bunduc et al., 2021; Beckham et al., 2021). These observations expose a variable degeneration of the Walker B motif across different EccC DUF domains that may impair the ATPase activity to differing extents. In addition, the MsmEccC3 DUF, MtbEccC5 DUF and MxpEccC5 DUF domain structures reveal a noncanonical arrangement of the Walker A motif lacking the typical P-loop structure, which is replaced by an extended α-helical conformation. Interestingly, this arrangement is also observed in the structure of EssC D3 from S. aureus (SrsEssC D3; PDB entry 6tv1), which is also devoid of ATP and exhibits an ATP-binding affinity in the low-millimolar range (Mietrach et al., 2020).
The structural and functional information available for T7SSs leaves outstanding questions regarding the molecular basis underlying the ATPase activity proposed for the DUF domain. In this regard, here we report the crystallographic structure of the MtbEccC5 DUF domain at 2.05 Å resolution, which provides an unambiguous model showing a nucleotide-free structure with degenerated cis-acting and trans-acting elements involved in ATP binding and hydrolysis. Our high-resolution structure, together with a biophysical assessment of the interaction of MtbEccC5 DUF with ATP/Mg2+ in vitro, supports the absence of ATPase activity in this domain. These results are in line with an in silico analysis carried out in other EssC and EccC enzymes, which reveals the presence of degenerated DUF domains in other mycobacterial and nonmycobacterial T7S systems that are likely to exhibit null or deficient ATPase activity. These findings suggest that DUF domains play a different role in the secretion process. Based on an in silico model of MtbEccC5 and the mutagenesis studies reported for MsmESX-3 (Famelis et al., 2019), we propose that this role may be related to the opening of the membrane-pore complex during secretion.

Protein expression and purification
A protein construct containing the DUF domain of EccC5 from M. tuberculosis (MtbEccC5 DUF, residues 1-417; UniProt entry P9WNA5) was designed for recombinant production. Residues 17-118 and 167-198 were replaced by GSSG and GSG sequences, respectively, in order to remove the transmembrane (TM) and stalk regions and enable production as a soluble protein (see Fig. 2 and Supplementary Fig. S1). The construct contained an N-terminal 6×His tag followed by a SUMO tag and a PreScission 3C cleavage site. The DNA coding for the construct cloned into a pET-28a plasmid (between NcoI and XhoI restriction sites) was purchased from GenScript with codon optimization for Escherichia coli expression (pET-EccC5 DUF). Expression of EccC5 DUF was performed in E. coli BL21 (DE3) Star (Invitrogen) cells using 2×TY medium supplemented with kanamycin (50 µg ml−1). The cells were grown at 37°C to an OD600 of 0.8 and then cooled to 16°C for an hour. Expression was then induced at 16°C by the addition of 1 mM isopropyl β-D-1-thiogalactopyranoside for 20 h. The cells were harvested by centrifugation at 5251g for 20 min at 4°C and the cell pellets were stored at −20°C.
The cell pellets were thawed and resuspended in 20 mM Tris-HCl pH 7.5, 500 mM NaCl, 10 mM imidazole (buffer A) supplemented with 5 mM MgCl2, 1 mM MnCl2 and 1 mg ml−1 DNase (Sigma). This step was performed at room temperature, while subsequent purification steps were carried out at 4°C. The cells were disrupted by sonication and the clarified extract was loaded onto a 1 ml HisTrap HP column (Cytiva) previously equilibrated in buffer A. Protein samples were eluted using an imidazole gradient (from 10 to 500 mM) and then buffer-exchanged to 20 mM Tris-HCl pH 7.5, 150 mM NaCl (buffer B) using a PD-10 desalting column (Cytiva). Samples were diluted to 1 mg ml−1 in buffer B prior to overnight cleavage with 3C protease. The cleaved protein was concentrated using Amicon 10 kDa molecular-weight cutoff concentrators (Millipore) and further purified by size-exclusion chromatography (SEC) using a Superdex 200 (16/60) column (Cytiva) pre-equilibrated with buffer B or 20 mM HEPES pH 7.5, 150 mM NaCl, 5 mM MgCl2 (buffer C) for crystallographic or ITC experiments, respectively. Pure protein fractions obtained from SEC were pooled, concentrated and flash-frozen in liquid nitrogen for storage at −80°C. Protein sample purity and quantification were assessed by SDS-PAGE and UV-Vis absorbance at 280 nm, respectively.
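Quantification by absorbance at 280 nm rests on the Beer-Lambert relation, A = ε·c·l. The sketch below shows the arithmetic only; the extinction coefficient, molecular weight and absorbance reading are hypothetical placeholders, since the values used for the construct are not stated here.

```python
def protein_conc_mg_per_ml(a280, ext_coeff_m1_cm1, mol_weight_da, path_cm=1.0):
    """Convert an A280 reading into a concentration in mg/ml.

    Beer-Lambert: A = epsilon * c * l, hence c (mol/l) = A / (epsilon * l).
    Multiplying by the molecular weight (g/mol) gives g/l, which equals mg/ml.
    """
    if ext_coeff_m1_cm1 <= 0 or path_cm <= 0 or mol_weight_da <= 0:
        raise ValueError("extinction coefficient, path length and MW must be positive")
    molar = a280 / (ext_coeff_m1_cm1 * path_cm)
    return molar * mol_weight_da


if __name__ == "__main__":
    # Hypothetical inputs: A280 = 0.50, epsilon = 40 000 M^-1 cm^-1, MW = 40 kDa.
    print(round(protein_conc_mg_per_ml(0.50, 40000.0, 40000.0), 3))
```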

Protein crystallization, structure resolution and analysis
All crystallization experiments were carried out using the sitting-drop vapour-diffusion method at 20°C in 96-well MRC plates (Hampton Research) employing an Oryx8 robot (Douglas Instruments). Initial crystals of EccC5 DUF were obtained in condition F2 of the Index screen. After optimization, the best crystals grew in 300 nl droplets formed by mixing 150 nl protein solution at 10 mg ml−1 and 150 nl precipitant condition [25%(w/v) PEG MME 2K, 300 mM trimethylamine N-oxide, Tris-HCl pH 8.0]. The crystals were flash-cooled in liquid nitrogen without cryoprotection for X-ray data collection, which enabled the measurement of a complete X-ray data set to 2.05 Å resolution at 100 K using synchrotron radiation at ALBA, Barcelona, Spain. The data set was indexed and integrated with XDS (Kabsch, 2010) and scaled and reduced using AIMLESS (Evans & Murshudov, 2013). The crystals belonged to space group P1, with unit-cell parameters a = 48.30, b = 52.24, c = 57.44 Å, α = 88.39, β = 84.60, γ = 80.17° (Table 1). Structure solution was carried out by molecular replacement using Phaser (McCoy et al., 2007) with the coordinates of the EccC DUF unambiguous solution was obtained, which consisted of two EccC5 DUF molecules in the asymmetric unit. The initial model was subjected to alternate cycles of model building with Coot (Emsley et al., 2010) and refinement with Phenix (Liebschner et al., 2019) and BUSTER (Bricogne et al., 2017) using NCS restraints. The electron-density maps allowed us to model both molecules of the asymmetric unit almost completely along with 287 water molecules. Residues 1-16-GSSG-119-123 (in chains A and B), 275-283 and 311-315 (in chain A) and 274-283 (in chain B) were not modelled due to poor electron density. The geometry of the final model was validated using MolProbity (Chen et al., 2010). Dihedral angles were analysed with Coot (Emsley et al., 2010) and Mogul (version 2020.3.0). Figures were generated using PyMOL (version 2.0; Schrödinger). The atomic coordinates and structure factors have been deposited in the Protein Data Bank under accession code 8rin.
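As a quick consistency check on the reported P1 cell, the unit-cell volume can be computed from the six cell parameters with the general triclinic formula. The sketch below assumes the conventional assignment of the three angles to α, β and γ in the order in which they are listed; the Matthews-coefficient helper takes the molecular weight as an input because the mass of the crystallized construct is not given here.

```python
import math


def triclinic_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (in cubic angstroms) of a general triclinic unit cell."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def matthews_coefficient(cell_volume, n_molecules_in_cell, mol_weight_da):
    """Matthews coefficient V_M in cubic angstroms per dalton."""
    return cell_volume / (n_molecules_in_cell * mol_weight_da)


# Cell parameters as reported in the text (lengths in angstroms, angles in degrees).
volume = triclinic_cell_volume(48.30, 52.24, 57.44, 88.39, 84.60, 80.17)

if __name__ == "__main__":
    print(f"cell volume: {volume:.0f} cubic angstroms")
    # Hypothetical molecular weight of 30 kDa per chain, two chains in the P1 cell.
    print(f"V_M: {matthews_coefficient(volume, 2, 30000.0):.2f}")
```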

Differential scanning fluorimetry
Differential scanning fluorimetry (DSF) assays were carried out to estimate the inflection temperature (Ti) associated with unfolding transitions of the EccC5 DUF construct in the absence and presence of nucleotide/Mg2+. The experiments consisted of monitoring the variation in the fluorescence emission of buried and exposed tryptophan residues at wavelengths of 330 and 350 nm, respectively, using a Tycho NT.6 (NanoTemper). DSF experiments were set up in a final volume of 10 µl buffer C containing the EccC5 DUF construct at 2 µM in the absence or presence of ADP-AlF3 at 4 mM. Mixtures of protein with the ATP analogue were incubated for 15 min before DSF measurements. Graph representations and analysis were performed using the Tycho analysis interface. Three independent runs were used in each case to calculate mean Ti values and the corresponding standard deviations.

Isothermal titration calorimetry
The ATP-binding and ATPase activities of the EccC5 DUF domain were assessed by isothermal titration calorimetry (ITC) using an ITC-VP instrument from GE Instruments. EccC5 DUF at 61 µM was loaded into the cell at 25°C and was titrated with ATP at 1.3 mM (in the syringe) using 15 injections, firstly of 1 µl, then of 10 µl and subsequently of 20 µl, at intervals of 5 min. Both protein and ATP were prepared in buffer C. Analysis of the results was carried out and final figures were generated using the Origin 7 ITC software.

Primary-, secondary- and tertiary-structure analysis in silico
A comparative analysis of the putative ATP/Mg2+-binding site across DUF domains and comparison with canonical ATPases was performed by Clustal Omega multi-sequence alignment (Sievers et al., 2011). Feasible predicted interfaces were analysed using jsPISA (Krissinel, 2015) in order to identify inter-monomer/domain contacts, which were subsequently inspected and validated in PyMOL. jsPISA was also used to analyse the predicted interfaces using the interaction radar score, which estimates the likelihood of a biological ensemble based on a statistical analysis of all interfaces found in the PDB. The interaction radar is divided into probability circles (0-100%). The larger the area, the higher the probability of the interface having biological significance, and area values fitting or above the 50% probability circle are likely to be part of a biological ensemble (Krissinel, 2015).

High-resolution structure of the EccC5 DUF domain
The overall crystallographic structure of the MtbEccC5 DUF domain (Fig. 2a) consists of an α-β-α sandwich that is highly homologous to the α/β fold found in the core of AAA+ domains. The core structure consists of a six-stranded parallel β-sheet (β5-β1-β4-β3-β2-β2′) further extended by an additional two-stranded antiparallel sheet (β6-β7). This β-sheet core is flanked by two helices (α1 and α2) on one side and four helices (α2′A, α2′B, α3 and α4) on the other side. The C-terminal α-helical lid present in other prokaryotic AAA+ domains is missing in the MtbEccC5 DUF structure, which is flanked by two parallel helices (αA and αB) and a two-stranded antiparallel sheet (βA-βB) that connects to the stalk domain. This structure is conserved in both MtbEccC5 DUF molecules present in the asymmetric unit (r.m.s.d. of 0.173 Å for 218 Cα atoms). Superimposition of the crystallographic and cryo-EM (MtbEccC5 CRYO; PDB entry 7npr; Bunduc et al., 2021) models shows overall structure conservation (r.m.s.d. of 0.671 Å for 198 Cα atoms), with the only significant differences in the N-terminal region (residues 1-16 are not defined in the electron-density maps). However, despite the overall conservation of the fold, other differences between our structure and MtbEccC5 CRYO are found, as will be described below. The proposed ATP-binding site should be located at the N-terminus of the α1 helix and the C-terminus of the β3 strand, which are expected to contain the P-loop and the Walker B motif, respectively. Sequence analysis of the Walker A motif shows a highly degenerated sequence that is one amino acid shorter (GEREQVL231) and is missing the conserved G2 and G3 glycines as well as the Lys and Thr/Ser residues that are key for ATP/Mg2+ stabilization (Val230 and Leu231 in MtbEccC5 DUF; Fig. 2b). Inspection of the MtbEccC5 DUF crystal model shows an ATP-free structure in which the degenerated Walker A motif folds as an extended α1 helix instead of the standard P-loop conformation. In this arrangement, α1 would clash with the nucleotide molecule, as revealed by structural superimposition of MtbEccC5 DUF with FtsK from P. aeruginosa (PrgFtsK), a closely homologous canonical ATPase. To understand the structural basis of this extended α1 structure, we investigated the impact of G2 and G3 variations in MtbEccC5 DUF, as these glycines are invariant flexible features involved in both turns of the P-loop motif. To do so, we modelled the replacement of G3 by alanine in PrgFtsK (G471A) in silico, which shows how the newly added Cβ gives rise to an unfavoured dihedral angle C(i)-N(i+1)-Cα(i+1)-Cβ(i+1) of 49° between Ser470 and Ala471 (Fig. 2d). This is supported by the low occurrence of this torsion angle when searching the CCDC database (Supplementary Fig. S1). This disfavoured conformation is prevented in MtbEccC5 DUF by a flip of the Glu228-Gln229 peptide bond, leading to the extended α-helical structure observed in α1 (Fig. 2d). Analogously, the absence of G2 should foster an α-helical conformation to avoid unfavourable dihedral angles between the two adjacent nonglycine amino acids. Altogether, this points to variations of G2 and G3 as structural factors responsible for the extended α1 helix observed in MtbEccC5 DUF. Inspection of the Walker B motif also shows conformational deviations in MtbEccC5 DUF, which include a shorter β3-α3 loop and a hydrophobic residue (Val329) replacing the catalytic base (Fig. 2a). Val329 rests partially buried between the β3-α3 and β4-α4 loops in a conformation that is well stabilized by a hydrogen-bond network with nearby residues. This drags the Walker B motif away from the P-loop by >2 Å compared with PrgFtsK, thus providing the room necessary to accommodate the extended α1 helix. This noncanonical arrangement of the Walker motifs is further stabilized by numerous direct and water-mediated interactions. These include a hydrogen bond between the side-chain amide group of Gln229 and the main-chain N atom of Glu226 and the formation of a salt bridge between the side chains of Arg227 and Asp328, which stabilize the extended α1 structure and are well defined in the electron-density maps (Supplementary Fig. S1). Moreover, the short bond distances measured for the salt-bridge interaction indicate that this is a strong contact that, overall, should further hinder any rearrangement of the nonproductive α1 conformation into one that is competent for ATP binding. Superimposition of the crystal and cryo-EM structures of MtbEccC5 shows overall conservation of the noncanonical arrangement observed in the Walker motifs in both models (Supplementary Fig. S1). However, significant differences were found in the structural features that shape the architecture of the degenerated ATP-binding site, supported by the electron-density and cryo-EM maps (Supplementary Fig. S1). These comprise many direct or water-mediated hydrogen bonds that stabilize the Walker and Sensor 1 motifs, including an interaction between Gln229 and Glu226 that is too long to form a hydrogen bond in MtbEccC5 CRYO and a water-mediated contact between the side chains of Arg227 and Asp258, which seems to assist the side chain of Arg227 in adopting an optimal orientation to interact with Asp328 and was not observed in the cryo-EM model. Moreover, in contrast to MtbEccC5 CRYO, the crystal structure of MtbEccC5 DUF shows a bidentate salt bridge between the side chains of Arg227 and Asp328 that is well defined in the electron-density maps and helps to optimize the interactions that stabilize the noncanonical configuration of the Walker A motif (Supplementary Fig. S1). Thus, our structure provides an unambiguous model defining the noncanonical ATP-binding site of MtbEccC5 at high resolution, which has allowed us to acquire a detailed description and understanding of the key structural features contributing to its degenerated arrangement.
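The torsion-angle argument made above for the G3-to-alanine replacement relies on measuring a dihedral angle defined by four consecutive atoms. The following is a minimal sketch of that calculation from Cartesian coordinates, written in plain Python; the coordinates in the usage lines are hypothetical test points rather than atoms from any deposited model, and the sign convention is simply that of the formula used here.

```python
import math


def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("the two central points must be distinct")
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical points arranged so that the two outer bonds are perpendicular.
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For the angle discussed in the text, the four points would be the C atom of residue i followed by the N, Cα and Cβ atoms of residue i+1.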
We next analysed the regions expected to contain the trans-acting elements in MtbEccC5 DUF (Supplementary Fig. S1). No basic residues were observed at the end of α4 as candidates to act as an Arg finger. Only Arg362 was found in the vicinity, and was further down the α4 sequence. However, its quite buried position in the structure makes it a poor candidate to function as an Arg finger. The helical C-terminal lid, which is habitually found in prokaryotic AAA+ clades and should contain Sensor 2, is missing in MtbEccC5 DUF. Besides, inspection of the α3 helix shows no basic residues in this region to form a potential Sensor 3. Taking all these observations together, our structural analysis of MtbEccC5 DUF shows the degeneration of the cis-acting elements and the absence of the trans-acting elements, which would support the lack of ATPase activity in this domain. These observations are in line with our failed attempts to co-crystallize MtbEccC5 DUF with ATP/Mg2+, with structures devoid of nucleotide and magnesium instead being obtained.

ATP-binding and ATP-hydrolysis analysis
The structural arrangement of the putative ATP-binding site of MtbEccC5 DUF resembles the nucleotide-binding pocket observed in the structure of the SrsEssC D3 domain (PDB entry 6tv1), which has been reported to bind ATP with an affinity in the low-millimolar range (Mietrach et al., 2020; Fig. 3a). This poor but detectable binding points to potential structural rearrangements of the ATP-binding site of SrsEssC D3 that enable interaction with the nucleotide. Therefore, we used two different biophysical approaches to analyse the interaction of MtbEccC5 DUF with ATP/Mg2+ and to explore a possible reorganization of the ATP-binding site that might enable nucleotide binding and hydrolysis in solution.
Firstly, the interaction of MtbEccC5 DUF with ADP-AlF3 (used as an ATP analogue) was assessed by DSF in the presence of magnesium. Despite the high ADP-AlF3/EccC5 DUF concentration ratio used (~4 mM/2 µM), the experiments showed that the difference between Ti values in the presence (59.13 ± 0.04°C) and absence (60.46 ± 0.04°C) of nucleotide was slightly negative (around −1.3°C) (Fig. 3b) and very close to the error limit of the device, thus pointing to an absence of binding. Nucleotide interaction and ATP hydrolysis were next assessed by ITC in the presence of magnesium. The heat released upon the titration of ATP into MtbEccC5 DUF was comparable to the thermal effect of ATP dilution in the range of concentrations tested (Fig. 3c). Additionally, no thermal power deflection (proportional to the potential substrate concentration in the cell) was observed after each ATP injection, as would be expected if the nucleotide was being hydrolysed (Menéndez, 2020). Overall, the ITC and DSF results support the lack of ATPase activity of MtbEccC5 DUF.
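The DSF shift quoted above is a difference between two mean inflection temperatures, each with its own standard deviation. A minimal sketch of that arithmetic follows, assuming that the two uncertainties are independent and combine in quadrature; this propagation rule is an assumption of the sketch and is not described in the text.

```python
import math


def delta_ti(ti_with_ligand, sd_with_ligand, ti_apo, sd_apo):
    """Return (shift, combined_sd) for an inflection-temperature difference.

    The combined standard deviation assumes independent errors added in
    quadrature, which is an assumption made for this illustration.
    """
    shift = ti_with_ligand - ti_apo
    combined_sd = math.hypot(sd_with_ligand, sd_apo)
    return shift, combined_sd


# Mean Ti values and standard deviations as reported in the text (degrees Celsius).
shift, combined_sd = delta_ti(59.13, 0.04, 60.46, 0.04)

if __name__ == "__main__":
    print(f"delta Ti = {shift:.2f} +/- {combined_sd:.2f} degrees Celsius")
```

A ligand that binds and stabilizes the folded state would normally be expected to raise Ti, so a small negative shift of this size is consistent with the absence of binding inferred above.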

The nucleotide-binding site in other EccC DUF domains
Given the structural degeneration that was observed in the MtbEccC5 DUF structure, and its observed inability to interact with ATP as assessed by X-ray crystallography, DSF and ITC, we decided to examine the putative ATP-binding site in other DUF domains. For this purpose, homologous EccC DUF domains and orthologous EssC DUF domains from mycobacterial and nonmycobacterial species were used in sequence alignment (Fig. 4a), which showed degeneration of the Walker A motif in all of the domains inspected. Analogously to MtbEccC5 DUF, these motifs present a sequence that is one amino acid shorter, in which the conserved residues G2 and G3 are missing and substituted by a nonglycine amino acid, respectively. They also lack both the lysine and threonine/serine residues that are required for interaction with ATP/Mg2+. The DUF domains inspected also exhibit degeneration of their Walker B motifs, in which the catalytic glutamate is replaced by other hydrophobic or polar amino acids (Rosenberg et al., 2015; Zoltner et al., 2016; Wang et al., 2020). Interestingly, analysis of the MxpEccC5 DUF and MsmEccC3 DUF structures showed that their Walker motifs adopt a similar arrangement to that observed in MtbEccC5 DUF, including an extended α1 helix (Fig. 4a). Analogously to MtbEccC5 DUF, the extended α1 helix can be explained by the amino-acid variations observed at the G2 and G3 positions with respect to canonical ATPases, which should elicit a helical conformation compared with the P-loop structure. This structural reorganization was also found in the structural models of MtbEccC1a DUF, MtbEccC DUF is formed by an arginine substituting the canonical threonine/serine of the Walker A motif (Fig. 4a). Of the 15 sequences analysed, five mycobacterial DUF domains exhibit an arginine/lysine and aspartate that could potentially form a salt bridge, which is also predicted in the AlphaFold models of MtbEccC2 DUF and MtbEccC4 DUF, where it is likely to contribute to stabilizing this nonproductive arrangement for ATP-Mg2+ recognition (Fig. 4b). Thus, our analysis shows that the noncanonical structure observed in the MtbEccC5 DUF Walker motifs is present in the cryo-EM structures of MxpEccC5 DUF and MsmEccC3 DUF, as well as in the predicted models of other DUFs from T7Sa and T7Sb systems. Together, these observations support the idea that the DUF domains are degenerated ATPase domains that exhibit a nonfunctional structure for ATP binding and thus a poor or null ATPase activity.

Hexameric model of the N-terminal region of MtbEccC5
The lack of ATPase function of EccC DUF domains proposed here raises important questions about their role in the secretion process. The relevance of these questions is particularly stressed by site-directed mutagenesis studies conducted on MsmESX-3, in which mutations of both of the aspartate residues of the Walker B motif of MsmEccC DUF 3 to alanines (D319A and D320A) were shown to abrogate secretion (Famelis et al., 2019). This observation leads us to speculate that these mutations might either prevent proper folding of MsmEccC DUF 3 or affect protein-protein interactions that are key for secretion. Plausible DUF interactors might consist of PE/PPE effector proteins and other cytosolic ESX domains. Considering the second scenario, the ubiquitin-like domains of EccD (ULD or EccD ULD) and DUFs of neighbouring protomers emerge as reasonable inter-protomer interactors, given their proximity in ESX complexes. Given the conservation of the ATPase-like fold of DUF, it is conceivable that this domain has retained the ability to multimerize in the context of DUF-DUF interplay, as has been observed for FtsK homologues and proposed for EccC D1-D3 domains (Rosenberg et al., 2015; Miller & Enemark, 2016). This hypothesis aligns with the size-exclusion chromatography profile observed during the production of MtbEccC DUF 5, which indicates the presence of monomeric, trimeric and hexameric species (Supplementary Fig.
S3). We thus attempted to explore MtbEccC DUF inter-protomer interactions using AlphaFold. However, the predicted models showed unreliable and/or unfeasible interfaces (see the supporting information). This could be due to a genuine absence of these interactions, or to limitations in the modelling process arising either from the reliability of the theoretical model itself or from the requirement for other domains/components of the cytosolic complex that were missing. Given these results, we decided to explore the potential multimerization of MtbEccC DUF 5 based on the available experimental structures of AAA+ enzymes in oligomeric states. To do so, we used the crystal coordinates of PrgFtsK in the hexameric state (PDB entry 2iuu) as a template onto which we superimposed the N-terminal region of MtbEccC 5, including the TM, stalk and DUF domains, using the crystal coordinates of MtbEccC DUF 5 and MtbEccC CRYO 5 (PDB entry 7npr). This resulted in a hexameric model in which the MtbEccC DUF 5 domains form a closed ring, with the stalk and TM regions located perpendicular to the DUF domains (Fig. 5a). Analysis of the interface between adjacent DUFs shows that the Walker A, Walker B and Sensor 1 motifs are located near the neighbouring monomer, but they do not participate in inter-domain contacts (Fig. 5b). However, direct interactions via residues in the �2-�2 loop, �2 and �5-�6 loop regions were identified. Interestingly, the �2-�2 region is located adjacent to the Walker A and B motifs, with which it forms direct interactions through several residues that include Arg227 and Asp328 (Fig.
5b). Based on these interactions, mutations in Arg227 and Asp328 and/or other amino acids of both Walker motifs might affect the structure of the �2-�2 region and, by extension, MtbEccC DUF 5 multimerization. This would also explain the lack of secretion observed in both Walker B motif mutants, D319A and D320A, of MsmEccC DUF 3 (Famelis et al., 2019), which correspond to Walker B motif residues Asp328 and Val329 in MtbEccC DUF 5, respectively. Moreover, the residues identified at the DUF-DUF interface differ from those observed in the interaction with the EccD ULD 5 domain (Bunduc et al., 2021), which shows the hexameric model of MtbEccC DUF 5 to be compatible with EccC DUF 5 -EccD ULD 5 interplay as a key structural element in ESX complexes (Famelis et al., 2019; Bunduc et al., 2021; Beckham et al., 2021). With regard to the TM and stalk regions, these are located perpendicularly and at the edge of the ring formed by MtbEccC DUF 5 domains, with the TM regions of opposite monomers separated by a distance of ∼86 Å. This disposition notably contrasts with the EccC 3 and EccC 5 organization observed in the MsmESX-3, MtbESX-5 and MxpESX-5 structures, in which the TM regions multimerize to form a helical bundle that connects, via the stalk region, to DUF domains positioned away from each other (Famelis et al., 2019; Bunduc et al., 2021; Beckham et al., 2021; Fig. 5c). Of note, the TM helical bundle is located at the centre of the membrane region, where it contributes to a closed conformation of the membrane pore. Thus, the aperture of the TM and stalk regions depicted by our model suggests a potential conformational change of the N-terminal region of MtbEccC DUF 5 that would explain how the membrane pore might open through hexamerization of the DUF domain (Fig.
5c). Given the distance measured across DUF domains in the MtbEccC DUF 5 model (∼35 Å), and the dimensions of the periplasmic chamber observed in the MtbESX-5 complex (Bunduc et al., 2021), the proposed hexamerization could allow a pore opening with a diameter of 35-45 Å, which would be capable of accommodating the secreted PE-PPE heterodimers. Overall, our MtbEccC 5 model suggests that DUF hexamerization is a plausible event that might play a critical role in the aperture of the membrane pore. This multimerization would be triggered by the interplay of DUFs with the EccC D1-D3 ATPase domains, thus linking ATP hydrolysis and effector recognition to the opening of the membrane pore necessary for secretion. The inter-protomer interplay proposed here for DUF domains remains a hypothesis that requires further study to be verified, and it raises additional outstanding questions. For instance, how does the coupling of ATP hydrolysis, effector recognition and multimerization of the D1-D3 ATPase domains occur with the proposed DUF-DUF and/or DUF-ULD interplay and the aperture of the membrane pore?
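
The geometry implied by a sixfold ring can be checked with a short calculation. The sketch below is an idealization and not the modelling procedure used here: it assumes perfect sixfold symmetry and treats each TM region as a single point on a circle (the helper names `ring_positions` and `distance` are hypothetical). Under these assumptions, the ∼86 Å separation between opposite TM regions quoted above corresponds to a ring radius of ∼43 Å, and neighbouring TM regions are also ∼43 Å apart, because the side of a regular hexagon equals its circumradius.

```python
import math


def ring_positions(radius: float, n: int = 6):
    """Place n points evenly on a circle around the symmetry axis (2D projection)."""
    return [
        (radius * math.cos(2 * math.pi * k / n),
         radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


def distance(p, q) -> float:
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Approximate separation of opposite TM regions in the hexameric model (in Å)
opposite_separation = 86.0

# Assumption: ideal sixfold symmetry, so the ring radius is half that separation
points = ring_positions(opposite_separation / 2)

adjacent = distance(points[0], points[1])  # neighbouring protomers
across = distance(points[0], points[3])    # opposite protomers

print(f"adjacent: {adjacent:.1f} A, opposite: {across:.1f} A")
```

Real protomers are extended objects, so the accessible opening is smaller than these centre-to-centre distances; the sketch only relates the distances to one another.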

Conclusions
Here, we report the crystallographic structure of the MtbEccC DUF 5 domain at high resolution, which reveals an ATP/Mg 2+ -free structure with both high degeneration and the absence of the cis- and trans-acting motifs required for the binding and hydrolysis of ATP, respectively. Among the most remarkable features, we find the absence of the catalytic glutamate required for nucleophilic attack on ATP and a noncanonical fold of the Walker A motif. Instead of the typical P-loop conformation, the Walker A motif adopts an extended α-helical arrangement that would clash with the nucleotide. Although the overall conservation of this arrangement is also observed in MtbEccC CRYO 5, our crystal structure provides previously unseen and unambiguously defined interactions that shape the observed noncanonical ATP-binding site. Moreover, our structural analysis of MtbEccC DUF 5 has led us to identify the structural features that underlie this arrangement. These include the variations of G 2 and G 3 observed in the Walker A motif of MtbEccC DUF 5, which introduce geometrical restraints that favour the extended α1 helical structure instead of the canonical P-loop conformation. We also demonstrated the lack of interaction of MtbEccC DUF 5 with the nucleotide in solution using DSF and ITC, which aligns with our unsuccessful attempts to co-crystallize this domain with ATP/Mg 2+ and further supports the inability of the DUF domain to bind ATP. Our study also shows that the degenerated ATP-binding site observed in MtbEccC DUF 5 is present in the experimental and predicted structures of other homologous domains from T7Sa and T7Sb systems, which supports the notion that DUFs are degenerated ATPase domains that are nonfunctional in nucleotide interaction/hydrolysis. DUF domains emerge, however, as pivotal elements in the secretion process, as illustrated by mutagenesis studies of the Walker B motif of MsmEccC DUF 3 that led to the abrogation of MsmESX-3 secretion. Putting all of these observations together, we speculate on the role of DUFs as structural elements involved in inter-protomer interplay, including a possible hexamerization of this domain with potential implications for the opening of the membrane pore during the secretion process. This hypothesis, which requires further study, opens important new questions, such as how ATP hydrolysis, effector recognition and multimerization of the D1-D3 ATPase domains would couple with the proposed hexamerization of DUF domains and the aperture of the membrane pore.

Figure 1 (a) MtbESX-5 model based on reported cryo-EM structures, showing the proposed conformational alteration of EccC 5 between an open/extended conformation (left) and a closed/contracted conformation (right). (b) Domain architecture of EccC 5 consisting of the transmembrane (yellow), stalk (grey) and DUF (green) domains followed by three ATPase domains, D1, D2 and D3 (top), where the DUF and D1-D3 domains are interconnected by connectors referred to as Linker1, Linker2 and Linker3. (c) Superimposition of the SrsEssC D3 crystal structure (PDB entry 6vt1, blue) with cryo-EM models of MtbEccC DUF

Figure 2 (a) A general view of the MtbEccC DUF 5 crystal structure represented as green cartoons is shown (left), as well as details of the putative ATP-binding site (right). Salt-bridge and hydrogen-bond interactions are indicated as dashed lines in blue and black, respectively; the bond distances between Arg227 and Asp328 are also indicated. The Walker A, Walker B and Sensor 1 motifs are coloured blue, pink and orange, respectively, showing key residues (sticks) and water molecules (red spheres). A scheme with details of the regions of MtbEccC DUF 5 included in the crystallographic construct is shown. (b) Sequence alignment of the Walker A and Walker B motifs of MtbEccC DUF 5 with the canonical ATPase domains of PrgFtsK, SslHerA, PfrMCM, MsdVps4 and MtbEccC D1 1–5, showing the consensus sequence at the top. Key amino acids involved in nucleotide interaction/hydrolysis are highlighted in blue and pink, respectively. (c) Superimposition of the MtbEccC DUF 5 and PrgFtsK (PDB entry 4r7y, beige) structures represented as cartoons. The Walker A, Walker B and Sensor 1 motifs are coloured light blue, pink and orange, respectively, in MtbEccC DUF 5 and dark teal, purple and dark orange, respectively, in PrgFtsK. In PrgFtsK, the magnesium cation is represented as a green sphere, while ATP (in black) and residues key to the interaction with the nucleotide are represented as sticks. The Arg227-Asp328 salt bridge observed in MtbEccC DUF 5 and the hydrogen bonds involved in ATP/Mg 2+ stabilization in PrgFtsK are represented as dashed lines in blue and black, respectively. (d) Detail of the Walker A motif in MtbEccC DUF 5 and PrgFtsK, showing the favourable dihedral angle C(i)-N(i+1)-Cα(i+1)-Cβ(i+1) observed between Glu228 and Gln229 in the former compared with the nonfavourable dihedral angle generated between Ser470 and Gly471 of PrgFtsK when the glycine is substituted by an alanine.

Figure 3 (a) Superimposition of the MtbEccC DUF 5 (green) and SrsEssC D3 (PDB entry 6vt1, dark blue) crystal structures showing details of their degenerated ATP-binding sites. Residues occupying key positions for nucleotide/magnesium interaction and hydrolysis are labelled and represented as sticks. The Walker A, Walker B and Sensor 1 motifs of MtbEccC DUF 5 are coloured following the same colour code as in Fig. 2. (b) Superimposition of MtbEccC DUF 5 DSF profiles in the absence (control) and presence of ADP-AlF 3 (4 mM) and Mg 2+ (5 mM). The failure of ADP-AlF 3 /Mg 2+ to stabilize the domain structure against thermal denaturation was consistent with a lack of nucleotide binding. (c) ITC titration of ATP (1.3 mM) into EccC DUF

other DUF domain structures predicted by AlphaFold, all of which present the same extended conformation of α1 (Fig. 4 and Supplementary Fig. S2). Inspection of the Walker motifs in the MxpEccC DUF EM also reveals an arginine-aspartate salt bridge connecting α1 to the C-terminus of β3 (Fig. 4b). The salt bridge observed in MxpEccC DUF 5 involves an arginine at the N-terminus of α1, as in MtbEccC DUF 5, while the salt bridge in MsmEccC DUF 3

Figure 4 Analysis of a putative ATP/Mg 2+ -binding site in EccC DUF domains of ESX-1-ESX-5 systems from Mtb, Msm and Mxp. (a) Sequence alignment of the Walker A and B motifs (coloured blue and pink, respectively). (b) Structural details of the Walker A and B motifs observed in MtbEccC DUF

Figure 5 (a) Structural comparison of the EccC 5 arrangement observed in the MtbESX-5 structure (PDB entry 7npr) with our model of the N-terminal region of MtbEccC 5 multimerizing as a hexamer. A surface representation of MtbEccC 5 molecules is shown, with the transmembrane, stalk and DUF domains coloured yellow, grey and green, respectively. The Walker A, Walker B and Sensor 1 motifs are coloured cyan, pink and orange, respectively. An enlarged view showing the molecular interface between adjacent DUF monomers in our model is shown in the box on the right, where identified contact regions are highlighted in yellow. (b) Details of the interface between adjacent DUF1 (grey) and DUF2 [using the colour code used in (a)] molecules are shown, with residues establishing contacts between the monomers and the Walker A and B motifs represented as sticks; their corresponding interactions are indicated as dashed lines. Water molecules are represented as red spheres. (c) Model showing the aperture of the TM region of MtbEccC 5 associated with the proposed multimerization of MtbEccC DUF

… superimposition in PyMOL (version 2.0; Schrödinger) using experimental coordinates deposited in the Protein Data Bank and models predicted by AlphaFold (Jumper et al., 2021) available at UniProt. The ATPases used in sequence alignment … (…; UniProt entry A0A656Z0M6) and YukB from Bacillus subtilis (BsbYukB; UniProt entry C0SPA7). Structural analysis of the dihedral angles and their occurrence in the Cambridge Crystallographic Data Centre (CCDC) database was performed using Mogul (version 2020.3.0). AlphaFold (Jumper et al., 2021; Evans et al., 2021) was employed to explore potential protein-protein interactions

Table 1 X-ray crystallographic statistics for MtbEccC DUF 5. Values in parentheses are for the highest resolution shell.